SUMMARY


A.  Problem 

Because  of  the  need  for  military  personnel  competent  in  foreign  language 
skills,  a  program  of  research  has  been  initiated  for  the  Defense  Language 
Institute  (DLI)  to  develop  tests  and  other  procedures  for  improving  selection 
of  language  trainees  capable  of  high  levels  of  language  achievement. 

B.  Background 

Currently  selection  of  students  is  primarily  based  on  the  Foreign 
Language  Aptitude  Test  (FLAT).  Prior  research,  however,  has  demonstrated 
the  importance  of  including  both  motivation  and  aptitude  tests  in  predicting 
foreign  language  achievement.  Since  present  DLI  selection  procedures  do  not 
include  systematic  measurement  of  trainee  motivation,  it  was  important  that 
non-cognitive  measures  be  considered  for  selecting  foreign  language  students. 

C.  Approach

In  addition  to  obtaining  FLAT  scores,  several  measures  such  as  the  Personal
Data  Questionnaire,  the  Navy  Adjective  List,  and  Instructor  Ratings
were  gathered  experimentally  at  the  Defense  Language  Institute  West  Coast 
for  validation  as  predictors  of  final  class  standing.  Keys  were  empirically 
developed  for  the  experimental  tests  in  part  of  the  sample  and  validated  on 
the  remainder.  Multiple  regression  techniques  were  used  to  determine  the 
best  combinations  of  predictors.  Where  data  were  available,  important 
findings  were  replicated  on  a  small  sample  from  the  Defense  Language 
Institute  East  Coast. 

D.  Findings,  Conclusions,  and  Recommendations

The  major  finding  of  this  research  is  that  prediction  of  language
achievement  can  be  markedly  improved  by  an  instructor's  rating  obtained  at
the  end  of  only  one  week  of  instruction--or  even  one  day  if  need  be  (pages
5,  7).  If  "trial  training"  were  implemented  for  the  purpose  of  obtaining
instructor's  ratings  prior  to  the  inception  of  formal  training,  considerable
expense  could  be  avoided  by  eliminating  those  students  considered  substandard
by  the  instructors.

If  brief  trial  training  proves  to  be  infeasible  and  an  ample  number  of
potential  trainees  is  available,  some  improvement  may  also  be  achieved  by
basing  selection  on  paper  and  pencil  tests,  i.e.,  a  combination  of  the
Personal  Data  Questionnaire  and  Foreign  Language  Aptitude  Test  scores
(page  10).

If  either  or  both  of  the  two  procedures  (i.e.,  paper  and  pencil  tests
and  instructor's  rating)  are  adopted  for  operational  use,  follow-up  research
is  recommended  on  a  larger  sample  of  Navy  personnel  to  improve  the  accuracy
of  the  weights  and  cutting  scores  used,  since  the  DLI  sample  included  members
of  all  the  military  forces.

SELECTION  OF  MILITARY  PERSONNEL  FOR
FOREIGN  LANGUAGE  TRAINING 


A.  BACKGROUND  AND  PURPOSE 


The  Foreign  Language  Aptitude  Test  (FLAT)  is  a  selection  test  having 
moderate  to  high  validity  for  predicting  success  in  foreign  language 
training.  The  test  requires  the  examinee  to  learn  the  vocabulary  of  an 
artificial  language  and  certain  grammatical  principles  of  the  artificial 
language,  all  of  which  are  applied  in  the  translation  of  sentences.  The 
FLAT  was  instituted  in  January  1963,  for  selecting  recruits  from  the  naval 
training  centers  for  foreign  language  training  at  the  Defense  Language 
Institute  (DLI). 

Even  though  prior  research  had  indicated  the  FLAT  alone  to  be  useful 
as  a  selection  instrument,  a  review  of  relevant  research  studies  demonstrated 
the  importance  of  including  motivation  as  well  as  aptitude  measures  for 
predicting  foreign  language  student  achievement.  Consequently,  an  extensive 
experimental  test  battery  was  given  to  students  at  the  DLI,  both  the  West 
Coast  (DLIWC)  and  the  East  Coast  (DLIEC)  Branches,  to  determine  the 
effectiveness  of  various  cognitive  and  noncognitive  measures  in  predicting 
success  in  foreign  language  training.^1  Comparisons  were  made  to  determine
if  selection  using  FLAT  alone  could  be  improved  upon  through  the  addition 
of  (1)  an  instructor's  rating  obtained  at  the  end  of  the  first  day  and/or 
first  week  of  the  course,  (2)  interest  and  motivational  questionnaires 
empirically  keyed  to  predict  foreign  language  achievement,  and  (3)  other 
experimental  and  biographical  indices  such  as:  Pay  Grade,  Age,  Education 
Level,  and  Vocabulary  Learning  Test  scores.  A  detailed  description  of  the 
procedures  used  in  the  analysis  of  the  DLIWC  data  and  the  results  obtained 
were  presented  in  an  earlier  technical  report  (Neumann,  Abrahams  &  Githens,
1968).  The  present  report  provides  the  DLIWC  findings  in  a  less  technical
manner  and  presents  the  results  of  a  replication  of  the  relevant  findings 
on  the  DLIEC  sample. 


B.  POPULATION 


The  primary  population  studied  consisted  of  660  men  enrolled  in  a  wide
variety  of  language  classes  at  the  DLIWC,  located  at  Monterey,  California.
Due  to  the  relatively  small  sizes  of  the  individual  classes,  the  small
proportion  of  naval  personnel  in  attendance,  and  the  need  for  sizeable
groups  for  statistical  analysis,  the  sample  studied  also  included  Army,
Air  Force,  and  Marine  Corps  students.  Army  personnel  made  up  the  largest
portion  of  the  sample--472  men,  or  71.5  per  cent  of  the  total.  Navy  men
constituted  13.9  per  cent  of  the  total;  the  Air  Force  contributed  11.8
per  cent,  and  the  remaining  2.8  per  cent  came  from  the  Marine  Corps.  The
majority  of  the  students,  88.8  per  cent,  were  from  the  enlisted  ranks,
and  the  remainder  were  officers.  Age  ranged  from  17  to  51  years,  with  a
median  of  20.  The  amount  of  formal  education  ranged  from  less  than  high
school  graduation  to  the  completion  of  Master's  degree  requirements,  with
a  median  of  two  years  of  college.  In  addition  to  this  DLIWC  group,  a
smaller  sample  (N=129)  was  obtained  from  the  DLIEC,  located  at  Washington,
D.C.,  for  replication  of  relevant  DLIWC  findings.

^1Dr.  Bob  D.  Rhea  served  as  Project  Director  during  the  early  stages  of
the  study.

C.  CRITERION 

Final  Class  Standing  (FCS),  adjusted  for  class  size,  was  used  as  the
criterion  of  foreign  language  achievement.  Adjusting  for  class  size  made 
it  possible  to  combine  students  from  different  classes  and  languages  into 
"language  groups"  on  a  common  scale  to  reflect  each  student's  relative 
classroom  achievement. 
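
The report does not state the formula used to adjust Final Class Standing for class size. As a minimal sketch, assuming the adjustment is a simple percentile-style rescaling of rank (a hypothetical choice, not the authors' documented method), standings from classes of different sizes can be placed on a common scale as follows. The function name and the midpoint convention are illustrative assumptions.

```python
def adjusted_standing(rank: int, class_size: int) -> float:
    """Rescale a class rank (1 = top of class) to the interval (0, 1).

    Assumption: a midpoint percentile, (rank - 0.5) / class_size.
    The report's actual adjustment is not given in the text.
    """
    if class_size < 1 or not 1 <= rank <= class_size:
        raise ValueError("rank must lie between 1 and class_size")
    return (rank - 0.5) / class_size


# A student ranked 3rd of 6 and one ranked 5th of 10 land close together
# on the common scale even though their raw ranks differ.
small_class = adjusted_standing(3, 6)
large_class = adjusted_standing(5, 10)
```

Under this assumed scale a lower value means a better standing, so a predictor that favors good students would correlate negatively with it.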


D.  PREDICTORS 

In  addition  to  FLAT,  the  operational  selection  test,  experimental 
predictors  were  assembled  from  other  tests  administered  after  selection  but 
prior  to  language  training,  ratings  secured  from  instructors  after  the 
first  week  of  class,  and  information  available  from  DLI  records.  Experimental
predictors  were  classified  into  three  types.  Test  scores  based  on
existing  scales  and  background  information  were  labeled  Set  I  predictors.
The  second  category,  Set  II,  consisted  of  specially  constructed  empirical
keys.  Instructor  ratings  made  up  the  third  category,  Set  III.

1.  Operational  Predictor:  Foreign  Language  Aptitude  Test  (FLAT)

This  test  was  originally  called  the  Army  Language  Aptitude  Test  (ALAT) 
when  it  was  developed  by  Dorcus,  Mount,  and  Jones  in  1952.  Later  Army 
studies  determined  the  ALAT  to  be  of  operational  utility  (Berkhouse, 
Mendelson  &  Kehr,  1959).  The  FLAT  is  currently  used  as  an  aptitude
screening  test  for  selection  to  DLI. 

2.  Experimental  Predictors 

a.  Set  I.  The  Set  I  predictors  consist  of  scores  based  on  existing
tests  and  background  information  records.  These  include  the  following
measures:

(1)  Insolence  Scale.  This  test  is  assumed  to  be  a  measure  of
passive-aggressive  personality  structure  and  has  been  found  to  be  related 
to  the  job  performance  of  Navy  third  class  enlisted  men  (Kipnis,  1965). 

Two  subscales  have  been  developed  and  scores  on  both  were  obtained  to
determine  their  effectiveness  in  predicting  foreign  language  achievement.

(2)  Hand  Skills  Test  (HST).  The  HST  is  designed  to  measure
motivation  by  testing  the  persistence  of  subjects  in  doing  a  simple  and 
monotonous  tally-marking  task  several  hundred  times  in  a  timed  situation 
(Kipnis  &  Glickman,  1958).  Scores  are  derived  by  subtracting  the  practice
trial  score  from  the  final  trial  score. 

(3)  Education  Level.  The  amount  of  formal  education  completed  prior
to  admission  to  the  DLI  was  employed  as  a  continuous  scale,  measured  in
years.  It  ranged  from  less  than  high  school  graduation  to  the  completion
of  graduate  college  degrees,  with  a  mean  education  level  of  two  years  of
college.


(4)  Pay  Grade.  Pay  Grade  is  a  code  that  is  uniformly  used  by  each
of  the  services  to  reflect  salary  level.  All  grades  were  recoded  to  form
a  continuous  measure.  Grades  E1  through  E9  were  coded  1  through  9,
respectively,  grades  W1  through  W4  as  10  through  13,  respectively,  and
officer  grades  O1  through  O6  as  14  through  19,  respectively.

(5)  Age.  The  school  input  age  ranged  from  17  to  51  years,  and  was 
used  as  a  continuous  variable. 

(6)  Vocabulary  Learning  Test.  This  test  consists  of  20  unusual 
English  words  and  their  definitions.  It  was  the  first  test  presented  in 
the  test  battery  at  the  DLI.  Ten  minutes  were  given  for  the  students  to 
learn  the  list.  At  the  end  of  the  test  battery,  only  the  definitions  were 
presented  and  the  students  were  given  five  minutes  to  supply  the  appropriate 
word  from  memory.  Three  experimental  scores  were  derived  as  a  means  of 
measuring  the  ability  to  recall  and  match  newly  presented  words  to  their 
meanings:


(a)  The  number  of  accurately  spelled  words  recalled. 

(b)  The  number  of  words  for  which  at  least  the  first  two 
letters  were  correct. 

(c)  The  number  of  words  attempted. 
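
Two of the Set I codings described above lend themselves to a short sketch: the Pay Grade recoding and the three Vocabulary Learning Test scores. The function names are hypothetical, and two scoring details are assumptions not stated in the report: matching is treated as case-insensitive, and a blank response is not counted as an attempt.

```python
def recode_pay_grade(grade: str) -> int:
    """Recode E1-E9 as 1-9, W1-W4 as 10-13, and O1-O6 as 14-19."""
    # prefix -> (offset added to the level, highest valid level)
    ranges = {"E": (0, 9), "W": (9, 4), "O": (13, 6)}
    prefix, level = grade[0].upper(), int(grade[1:])
    if prefix not in ranges or not 1 <= level <= ranges[prefix][1]:
        raise ValueError(f"unrecognized pay grade: {grade!r}")
    return ranges[prefix][0] + level


def vocabulary_scores(answer_key, responses):
    """Return (exact, first_two_letters, attempted) counts for one examinee."""
    exact = first_two = attempted = 0
    for target, given in zip(answer_key, responses):
        given, target = given.strip().lower(), target.strip().lower()
        if not given:
            continue  # assumption: a blank response is not an attempt
        attempted += 1
        exact += given == target                                   # score (a)
        first_two += len(given) >= 2 and given[:2] == target[:2]   # score (b)
    return exact, first_two, attempted
```

For instance, `recode_pay_grade("W1")` gives 10, the first value after the enlisted range, which keeps the recoded scale continuous across enlisted, warrant, and officer grades.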

b.  Set  II.  This  predictor  set  consisted  of  two  questionnaires  which
were  empirically  keyed  to  predict  foreign  language  achievement.  The  primary
criterion  of  foreign  language  achievement  was  FCS.  Two  other  less  relevant
criteria  of  achievement  were  available--the  Listening  Comprehension  (L/C)
and  Reading  Comprehension  (R/C)  scores  of  the  Army  Language  Proficiency
Test  (ALPT).  The  construction  of  the  empirical  keys  to  predict  these
criteria  was  discussed  in  the  technical  report  by  Neumann,  et  al.  (1968).

(1)  Personal  Data  Questionnaire  (PDQ).  The  190  items  of  the  PDQ
include  biographical,  need  for  change,  acceptance  of  social  change,  and
study  habit  subtests  which  have  been  related  to  foreign  language
achievement  in  other  studies  (Altus,  1961;  Hebenstreit,  1959;  Heilbrun,
1962;  Lambert,  undated;  Levy,  1962;  Maier,  1959;  Pimsleur,  1962;  Preston,
1961).

(2)  Navy  Adjective  List  (NAL).  "Need  for  achievement"  has  in  past
research  been  found  to  be  related  to  school  success  (Barnette,  1961;  Bendig, 
1957,  1958;  Hebenstreit,  1959).  The  NAL  consists  of  103  adjectives,  many 
of  which  appear  relevant  to  the  "need  for  achievement"  concept. 

c.  Set  III--Instructor's  first  week  ratings.  To  determine  whether
instructors  can  accurately  predict  students'  ultimate  achievement  in 
language  school  from  performance  in  early  stages  of  instruction,  each 
language  instructor  at  DLIWC  rated  his  students  at  the  end  of  the  first 
week  of  class.  In  the  second  study,  conducted  at  DLIEC,  first  day  ratings 
were  also  obtained,  in  addition  to  the  first  week  ratings.  Ratings  were 
secured  on  seven-point  scales  to  estimate  a  student's  probable  degree  of 
language  "success,"  his  quality  of  "oral  production,"  and  his  "motivation" 
to  complete  language  training.  The  "oral  production"  rating  was  obtained 
in  response  to  the  suggestions  of  language  school  personnel  that  willingness 
to  vocalize  in  the  target  language  and  the  student's  correctness  of 
pronunciation  may  be  related  to  skill  in  acquiring  language  facility.  If
ratings  obtained  early  in  training  prove  valuable  in  predicting  achievement,
attempts  may  be  made  to  secure  such  ratings  at  designated  centers  for
use  in  the  selection  of  language  school  students  prior  to  transfer  to  the
DLI.


E.  PROCEDURE


1.  Language  Groups

Insufficient  sample  size  for  any  single  language  necessitated  combining 
classes  into  language  groupings.  The  DLI  suggested  six  language  groups 
based  on  language  structure  and  grammar,  comparable  difficulty  in  acquiring 
a  vocabulary,  length  of  time  required  to  achieve  desired  proficiency  with 
the  language,  and  ability  required  to  discriminate  tonal  changes. 

Sufficient  data  were  available  to  analyze  the  following  three  of  the  six 
recommended  categories: 

a.  Indo-European  (Western):  Albanian,  French,  German,  Greek,  Italian,
Portuguese,  Romanian,  and  Spanish.

b.  Indo-European  (Eastern):  Bulgarian,  Czechoslovakian,  Hungarian,
Persian,  Polish,  Russian,  and  Serbo-Croatian.

c.  Indo-Chinese:  Burmese,  Chinese,  Japanese,  Korean,  Malayan,  Thai, 
and  Vietnamese. 

2.  Criterion  Scores 

Prior  to  statistical  analyses  of  the  data,  it  was  necessary  to  make  the
criterion  scores  comparable  for  the  various  language  classes  combined  into
each  language  group.  By  assuming  that  an  individual's  class  standing  was
not  influenced  by  course  length,  that  each  student's  interest  and  aptitude
were  not  dependent  upon  which  language  he  was  studying,  and  that  the
distribution  of  student  aptitude  was  equal  between  language  classes  within
a  group,  it  was  possible  to  include  all  classes  regardless  of  course  length
or  content  in  each  language  group  for  the  prediction  of  FCS.

3.  Statistical  Analyses 

a.  Key  construction.  Special  keys  were  built  for  the  PDQ  and  the  NAL
to  predict  FCS.  Two-thirds  of  the  students  were  used  for  key  construction,
and  the  remaining  randomly  selected  one-third  were  held  out  for  cross- 
validation.  These  samples  were  used  to  construct  and  validate  separate 
keys  for  each  language  group,  and  also  to  construct  and  validate  one 
general  key  for  the  three  combined  groups. 

b.  Multiple  regression.  Statistical  procedures  were  used  to  evaluate
increases  in  validity  as  the  number  of  predictors  was  increased.
Multiple-regression  procedures  were  used  to  derive  four  equations  for  four
combinations  of  predictor  sets  within  each  language  group.  These  equations
were  developed  to  predict  FCS  from:  (1)  FLAT  and  Set  I,  (2)  FLAT,  Set  I,
and  Ratings,  (3)  FLAT,  Set  I,  and  Set  II,  and  (4)  FLAT,  Set  I,  Set  II,
and  Ratings.  In  addition,  general  equations  were  computed  for  the  three
language  groups  combined.
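
The two-thirds / one-third division described under key construction can be sketched as a seeded random split. This is only an illustration: the function name, the seed, and the use of all 660 DLIWC students in one split are assumptions, since the report built keys separately within each language group as well as for the combined groups.

```python
import random


def split_sample(student_ids, construction_fraction=2 / 3, seed=1968):
    """Randomly divide students into key-construction and hold-out samples."""
    rng = random.Random(seed)       # fixed seed so the split is reproducible
    shuffled = list(student_ids)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * construction_fraction)
    return shuffled[:cut], shuffled[cut:]


# Illustrative use with the DLIWC sample size reported in this study.
construction, holdout = split_sample(range(660))
```

Keeping the hold-out students entirely out of key construction is what makes the cross-validated validities an honest estimate; validities computed on the construction sample itself would be inflated.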


F.  RESULTS  AND  DISCUSSION


1.  DLIWC  Results 

The  DLIWC  data  were  analyzed  through  multiple  regression  to  assess 
possible  increases  in  validity  for  four  combinations  of  predictor  sets. 

The  individual  predictor  validities  are  presented  in  Appendix  Table  A  for 
the  key  construction  and  cross-validation  samples.  The  complete
intercorrelation  matrices  for  each  of  the  samples  can  be  found  in  the  technical
report  on  the  DLIWC  data  analysis  (Neumann,  et  al.,  1968).

The  equations  resulting  from  multiple-regression  analyses  are  presented 
in  Appendix  Tables  B  and  C.  They  were  evaluated  for  predictive  efficiency 
and  the  best  equations  selected  for  each  language,  one  including  instructor 
ratings  and  one  excluding  ratings.  The  validities  of  the  selected  equations 
are  presented  in  Table  1,  along  with  the  validities  of  FLAT  alone.  In  all
instances,  prediction  of  foreign  language  achievement  can  be  improved  by
applying  the  appropriate  composite  rather  than  just  using  FLAT  alone.

For  the  Indo-European  languages,  both  Western  and  Eastern,  the  use  of
an  instructor's  rating  combined  with  FLAT  (equations  1  and  3)  shows  the
largest  gain  in  validity,  43  and  18  correlation  points,  when  compared  with
that  of  FLAT  alone  in  the  respective  language  groups.  However,  should  it
not  be  administratively  feasible  to  obtain  first  week  instructor's  ratings,
selection  of  more  successful  Indo-European  Western  and  Eastern  language
students  can  still  be  effected  by  the  use  of  equations  2  and  4.  For  the
Western  languages,  a  composite  score  for  selection  purposes  may  be  based  on
a  weighted  combination  of  Education  Level  and  scores  on  the  empirically
derived  PDQ  and  NAL  scales.  Similarly,  combining  a  Vocabulary  Learning
Test  score  with  FLAT  shows  an  increase  in  validity  over  that  possible  with
FLAT  alone  for  the  Eastern  language  group.


TABLE  1

Recommended  Multiple-Regression  Equations  for  the
Prediction  of  Final  Class  Standing

Note  --  Regression  weights  and  variables  were  determined  on  the  smaller
sample  and  cross-validated  on  the  larger  sample,  due  to  constraints  in
the  use  of  the  data.


For  the  Indo-Chinese  languages,  Pay  Grade  in  combination  with  FLAT
provides  the  greatest  improvement  with  the  predictors  investigated  in  this 
study,  i.e.,  a  gain  of  9  correlation  points  over  the  validity  of  FLAT  alone. 
However,  to  eliminate  potential  students  on  the  basis  of  Pay  Grade  may 
conflict  with  more  essential  needs  of  the  service  and  prove  to  be  impractical. 
In  this  case,  provided  that  ratings  are  introduced  as  an  operational  selector 
for  language  school,  then  equation  5,  identical  to  equations  1  and  3,  would 
raise  the  validity  from  .45  for  FLAT  alone  to  .49  for  the  composite. 

In  order  to  illustrate  the  practical  effects  of  the  various  predictors 
in  terms  of  student  achievement,  separate  analyses  were  directed  toward 
identifying  students  who  graduated  in  the  upper  half  of  their  classes. 

Using  a  variety  of  possible  selection  cut-offs  on  FLAT  and  the  composite 
predictors,  i.e.,  those  scoring  in  the  upper  20  per  cent,  upper  40  per  cent, 
upper  60  per  cent,  and  upper  80  per  cent,  the  percentage  of  "top  half" 
students  was  computed  (see  Table  2).  For  example,  with  the  Western  language 
group,  if  selection  were  limited  to  the  upper  20  per  cent  with  respect  to 
the  composite  predictor,  which  includes  instructor  ratings,  29  per  cent  more 
"top  half"  graduates  could  be  expected  when  the  composite  is  used  than  when 
FLAT  is  used  as  a  single  predictor.  For  the  Western  and  Eastern  languages, 
using  any  of  the  four  cut-offs,  the  composite  predictor  is  as  good  or  better 
than  FLAT  alone.  For  the  Indo-Chinese  languages,  however,  some  comparisons 
do  not  favor  the  composite  predictors  over  the  use  of  FLAT  alone,  as 
indicated  by  negative  increments. 
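
The expectancy comparison described above can be sketched as follows. This is a minimal illustration with hypothetical function names; it assumes each predictor is oriented so that a higher score is more favorable, and that the number selected at a given cut-off is rounded to the nearest whole student.

```python
def percent_top_half(scores, in_top_half, fraction):
    """Percentage of "top half" graduates among the highest-scoring fraction."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    # Rank students from highest to lowest predictor score.
    ranked = sorted(zip(scores, in_top_half), key=lambda pair: pair[0], reverse=True)
    n_selected = max(1, round(len(ranked) * fraction))
    selected = ranked[:n_selected]
    return 100.0 * sum(flag for _, flag in selected) / n_selected


def expectancy_gain(composite, flat, in_top_half, fraction):
    """Increment in per cent "top half" students: composite minus FLAT alone."""
    return (percent_top_half(composite, in_top_half, fraction)
            - percent_top_half(flat, in_top_half, fraction))
```

A negative value returned by `expectancy_gain` corresponds to the negative increments noted for some Indo-Chinese comparisons, where FLAT alone selected a larger share of "top half" students than the composite did.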

2.  Replication  of  Relevant  DLIWC  Findings  on  DLIEC  Data 

Due  to  the  relatively  small  input  to  DLIEC,  data  collection  was  limited 
to  a  sample  of  129  subjects.  When  categorized  by  languages,  it  became 
apparent  that  only  the  Indo-European  (Western)  language  group  (N=75)  was
sufficiently  large  to  permit  meaningful  analysis.  Validities  for  predicting 
foreign  language  achievement,  as  measured  by  FCS,  were  computed  for  each 
available  predictor  and  are  presented  in  Appendix  Table  A. 

In  addition  to  the  first  week  ratings,  each  student  was  rated  by  an 
instructor  on  each  of  the  three  scales  ("success,"  "oral  production,"  and 
"motivation"),  after  having  been  observed  for  one  full  day.  A  comparison 
is  made  in  Table  3  between  the  validity  of  first  day  and  first  week  ratings 
since,  for  selection  purposes,  a  one  day  rather  than  a  one  week  rating 
would  be  preferred.  Although  the  comparison  between  first  day  and  first 
week  ratings  is  available  only  on  this  relatively  small  sample,  the  results 
are  favorable  to  replacing  first  week  with  first  day  ratings.  Even  though
the  validity  of  the  "success"  rating  scale  is  lowered  from  .71  based  on  a
first  week  estimate,  to  .51  for  the  first  day  rating,  and  from  .67  to  .35
for  the  "oral  production"  rating,  respectively,  these  first  day  validities
still  exceed  the  correlation  of  .29  between  FLAT  and  the  FCS  criterion.  If
the  instructors  had  designed  their  initial  lessons  to  facilitate  student
selection,  presumably  even  higher  validities  could  be  obtained.


TABLE  2

Percentage  Expected  in  Top  Half  of  Class  for
Various  Language  Groups


TABLE  3

Comparison  of  the  Validities  Obtained  Through  the  Combined
Use  of  FLAT  With  a  Rating  Scale  Versus
Using  FLAT  Only
(N=75)

                                           Separate Validity    Increase of
Composite                     Composite    -----------------    Composite Over
(FLAT & Rating)               Validity     FLAT      Rating     FLAT Alone

FLAT & 1st Day "success"        .55        .29        .51         .26**
FLAT & 1st Day "oral"           .38        .29        .35         .09*
FLAT & 1st Week "success"       .71        .29        .71         .42**
FLAT & 1st Week "oral"          .68        .29        .67         .39**

Notes  --

*Significant  beyond  the  .05  level.

**Significant  beyond  the  .01  level.

Single  variable  coefficients  are  Pearson  r's  and  are  presented
as  positive  for  purposes  of  comparing  with  multiple  R's.


For  purposes  of  indicating  any  increase  in  prediction  possible  through 
the  combined  use  of  ratings  with  FLAT,  multiple  correlations  were  computed 
for  each  of  the  potentially  useful  ratings  obtained  at  DLIEC.  These 
composite  validities  are  presented  with  FLAT  validities  for  comparison  in 
Table  3.  Varying  increments  in  validity  are  indicated  in  Table  3,  depending 
upon  which  rating  scale  is  being  considered,  with  three  of  the  four  increases 
being  significant  beyond  the  one  per  cent  level.  For  both  first  day  and 
first  week  ratings,  the  largest  contribution  to  validity  is  from  the 
"success"  rating,  which  is  based  on  an  instructor's  estimate  of  each 
student's  probable  degree  of  language  success. 
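
How much a rating can add to FLAT depends on how strongly the two predictors overlap. The standard two-predictor formula for a multiple correlation makes this concrete. The sketch below is illustrative only: the function names are hypothetical, and the FLAT-rating intercorrelation (`r_between`) is an input the reader must supply, since it is not reported in this section.

```python
from math import sqrt


def multiple_r(r_flat, r_rating, r_between):
    """Multiple correlation of a criterion with two predictors.

    Standard formula:
        R^2 = (r1^2 + r2^2 - 2*r1*r2*r12) / (1 - r12^2)
    where r1 and r2 are the separate validities and r12 is the
    correlation between the two predictors.
    """
    if abs(r_between) >= 1:
        raise ValueError("predictors must not be perfectly correlated")
    r_squared = (r_flat ** 2 + r_rating ** 2
                 - 2 * r_flat * r_rating * r_between) / (1 - r_between ** 2)
    return sqrt(max(0.0, r_squared))  # guard against tiny negative rounding error


def validity_gain(r_flat, r_rating, r_between):
    """Increase of the composite validity over FLAT alone."""
    return multiple_r(r_flat, r_rating, r_between) - abs(r_flat)
```

The composite validity can never fall below the larger of the two separate validities in the sample on which the weights are fitted, which is why cross-validation on a separate sample is needed to judge the real gain.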




In  addition  to  validating  first  day  instructor's  ratings,  the  relevant 
DLIWC  findings  were  replicated  on  the  DLIEC  data.  The  recommended  predictor 
composites  resulting  from  the  DLIWC  data  are  presented  in  equations  1  and 
2  of  Table  4.  For  the  first  equation,  a  validity  of  .62  was  obtained  on 
the  DLIEC  sample,  a  considerable  improvement  over  the  validity  obtained 
using  FLAT  alone.  Due  to  the  unavailability  of  the  Education  Level 
information  and  NAL  scores  at  DLIEC,  it  was  not  possible  to  cross-validate 
the  second  equation.  However,  in  an  attempt  to  provide  an  alternate 
selection  procedure  if  ratings  cannot  be  used,  weights  were  determined  on 
the  one-third  DLIWC  sample  for  the  combined  use  of  FLAT  and  the  PDQ  and 
cross-validated  on  both  the  two-thirds  DLIWC  sample  and  the  DLIEC  sample. 
Applying  regression  equation  3  resulted  in  a  significant  increase  for  both 
samples  in  the  Indo-European  (Western)  languages  over  the  use  of  FLAT  alone. 


TABLE  4

Cross-validation  on  Data  From  DLIEC  and  DLIWC  of  Weights
Determined  on  a  Portion  of  DLIWC  Data  for  the
Prediction  of  Final  Class  Standing

                                                     Validities
                                           DLIWC                  DLIEC
Regression Weights and               ------------------     ------------------
Variables in Equation                Composite    FLAT      Composite    FLAT

With Rating
1. -1.165 (1st Week "Oral" Rating)
   -0.714 (FLAT)                       .70(b)    .27(b)       .66(c)    .29(c)

Without Rating
2. -4.069 (Education Level)
   -0.290 (PDQ) -0.271 (NAL)           .56(b)    .42(b)       Not available
3. -0.393 (PDQ) -0.655 (FLAT)          .67(b)    .42(b)       .53(c)    .29(c)

Notes  --

(a)  Based  on  one-third  sample  (N=66).
(b)  Based  on  two-thirds  sample  (N=139).
(c)  Based  on  total  sample  (N=75).
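
Cross-validating an equation such as number 3 in Table 4 amounts to applying the fixed weights to a new sample and correlating the resulting composite with FCS. The sketch below uses the weights as printed in Table 4; the function names are hypothetical, no intercept is given in the source (so the composite is usable only for ranking and correlation), and taking the absolute value follows the Table 3 note that coefficients are presented as positive.

```python
from math import sqrt


def composite_eq3(pdq, flat):
    """Composite score from equation 3 of Table 4 (weights as printed)."""
    return -0.393 * pdq - 0.655 * flat


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length, at least 2 items")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)  # raises ZeroDivisionError if a list is constant


def cross_validity(pdq_scores, flat_scores, standings):
    """Validity of the fixed-weight composite in a new sample, shown as positive."""
    composites = [composite_eq3(p, f) for p, f in zip(pdq_scores, flat_scores)]
    return abs(pearson_r(composites, standings))
```

Because the weights are not re-estimated on the new sample, a validity obtained this way is free of the upward bias that affects the sample on which the weights were derived.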

Table  5  is  an  expectancy  chart  analogous  to  Table  2,  designed  to  permit
comparison  between  the  percentages  of  expected  "top  half"  students  if 
selected  on  the  basis  of  two  composite  scores  or  on  the  FLAT  score  alone. 

As  an  example,  if  selection  were  limited  to  the  upper  20  per  cent  with 
respect  to  the  predictor,  14,  13,  and  9  per  cent  additional  above-average 
students  could  be  expected,  depending  upon  which  composite  is  used  as 
opposed  to  using  FLAT  alone. 

3.  Recommendations  for  Selection 

Foreign  language  achievement  in  an  intensive  language  training  course 
can  be  predicted  with  greater  accuracy  than  is  presently  possible  using 
FLAT  alone.  The  following  recommended  selection  procedures  are  optimal  for 
samples  similar  to  the  ones  analyzed  in  the  present  study,  i.e.,  heterogeneous 
samples  composed  of  members  from  the  various  branches  of  service:

a.  If  instructor  ratings  are  obtained  for  use  in  selecting  students, 
the  recommended  equations  differ,  depending  on  whether  first  week  or  first 
day  ratings  are  used. 

(1)  The  FLAT  score  combined  with  an  instructor's  first  week  rating 
results  in  improved  prediction  of  foreign  language  achievement  for  the  Indo- 
European  Western  and  Eastern  groups  over  that  provided  by  the  FLAT  alone. 

Only  slight  improvement  was  found  for  the  Indo-Chinese  languages  with  this 
multiple  and,  therefore,  it  is  recommended  for  use  only  when  a  uniform 
selection  procedure  seems  advantageous  or  if  more  essential  needs  of  the 
services  conflict  with  the  specific  recommendation  outlined  in  b(3)  below. 

(2)  Results  obtained  on  a  small  sample  of  DLIEC  students  on  one 
language  group  indicate  that  a  composite  of  first  day  ratings  and  FLAT  is 
not  as  effective  as  that  obtained  with  first  week  rating  and  FLAT.  However, 
a  significant  increase  in  validity  is  possible  over  that  obtained  with  FLAT 
alone.  Since  it  would  be  more  feasible  for  selection  use  to  obtain  first 
day  rather  than  first  week  ratings,  further  research  is  recommended  to  assess 
the  validity  of  first  day  ratings  for  a  much  larger  sample  composed  of  all 
language  groups. 

b.  If  it  is  not  economical  to  obtain  and  use  instructor  ratings  for 
selection,  alternative  equations  are  presented: 

(1)  For  the  Indo-European  (Western)  languages,  findings  based  on 
samples  from  both  the  DLIWC  and  DLIEC  schools  suggest  the  use  of  a  weighted 
combination  of  FLAT  and  the  PDQ. 

(2)  For  the  Indo-European  (Eastern)  group,  a  weighted  combination 
of  the  FLAT  and  the  Vocabulary  Learning  Test  is  recommended. 

(3)  For  the  Indo-Chinese  group,  a  weighted  combination  of  Pay  Grade
and  FLAT  improves  the  prediction  of  success  in  foreign  language  training. 

This  combination  is  recommended  for  use  provided  that  selecting  only  men 
from  the  higher  pay  grades  does  not  conflict  with  more  essential  needs  of  the 
services.


TABLE  5

Percentage  Expected  in  Top  Half  of  Class  for
Western  Language  Group



c.  Since  the  differential  prediction  resulted  from  the  grouping  of 
languages,  it  is  expected  that  similar  results  would  be  found  for  individual 
languages.  The  differential  prediction  of  language  achievement  for  a  specific 
language  was  originally  planned,  but  operational  restrictions  at  DLIWC  did 
not  permit  time  for  sufficient  students  to  be  tested  for  this  research. 

Thus,  it  is  felt  that  prediction  could  be  improved  even  further  if 
sufficient  data  were  available  for  analyses  of  the  more  widely  studied 
individual  languages,  such  as  French,  German,  Russian,  or  Chinese. 

4.  Limitations 


Since  the  personnel  needs  of  the  various  branches  of  service  differ,  FLAT 
had  not  been  applied  in  a  uniform  manner  for  selection  of  students  in  classes 
used  in  the  present  study.  Consequently,  the  resulting  regression  equations 
probably  do  not  predict  equally  well  for  all  service  branches.  They  do, 
however,  demonstrate  the  magnitude  of  increased  validity  possible  with  the 
additional  predictors.  The  optimal  equations  for  each  service  could,  of 
course,  be  constructed  only  from  sizeable  samples  from  each  service.  Since 
this  was  not  possible  with  the  existing  data,  the  prediction  equations 
represent  a  necessary  compromise,  and  again,  indicate  potential  gains  in 
predictive  efficiency. 

It  should  also  be  noted  that  the  instructor  ratings  are  probably 
underestimated  as  to  validity,  since  they  were  not  gathered  with  the  express 
purpose  of  facilitating  student  selection. 

G.  CONCLUSIONS  AND  RECOMMENDATIONS 


One  of  the  major  findings  of  this  research  is  that  the  foreign  language 
achievement  of  military  trainees  may  be  predicted  with  substantial  accuracy 
using  the  predictors  examined  in  this  study.  If  selection  were  based  on  only 
paper  and  pencil  tests  such  as  the  Foreign  Language  Aptitude  Test  (FLAT),  or 
the  Personal  Data  Questionnaire  (PDQ),  improvement  may  be  achieved  with  less
than  two  or  three  hours  of  testing.  Thus,  if  ample  potential  trainees  are 
available,  only  the  most  promising  of  a  group  of  potential  trainees  need  be 
selected  for  the  Defense  Language  Institute. 

Another  major  finding  of  this  research  is  that  trainee  language 
proficiency  at  the  end  of  the  course  of  instruction  can  be  fairly  readily 
predicted  by  an  instructor  at  the  end  of  only  one  week  of  instruction--or 
even  one  day  if  need  be. 

The  improvement  in  selection  obtained  by  paper  and  pencil  tests  can 
itself  be  improved  upon  by  using  both  tests  and  ratings.  This  procedure 
is  recommended.  It  seems  advisable  to  develop  means  for  permitting  language 
instructors  to  screen  potential  trainees  through  an  intensive  period  of 
language  training  of  perhaps  two  days  duration.  This  could  be  accomplished 
through  instructor  travel  to  training  centers,  or  by  sending  the  trainees 
to  the  language  school  for  a  short  trial  training  session.  It  is  believed 
that  the  cost  of  such  travel  could  be  offset  by  the  savings  in  training 
expenses  if  these  selection  procedures  were  instituted.  It  seems  likely
that  screening  out  the  potential  trainees  with  the  lowest  probability  of 
success  would  permit  the  remainder  of  the  group  to  complete  training  at 
markedly  lower  cost  in  time  and  dollars  per  graduate  with  considerably 
greater  language  proficiency  on  graduation  as  a  bonus.  Investigation  of 
the  operational  feasibility  of  this  procedure  is  recommended. 

The  selection  of  more  promising  foreign  language  trainees  is  possible 
if  either  of  the  two  previously  outlined  procedures  (i.e.,  paper  and 
pencil  tests  alone  or  combined  with  an  instructor's  rating)  is  established
for  operational  use.  Either  procedure  will  require  further  research  on  a 
sufficiently  large  sample  consisting  of  Navy  personnel  only  to  establish 
exact  weights  and  cutting  scores  to  be  used. 


Validities  for  the  Prediction  of  FCS  in  Key  Construction  and  Cross-validation  Samples

Note  --  Since  these  samples  were  used  to  construct  the  PDQ  and  NAL  keys,
significance  levels  are  not  appropriate  and  are  therefore  excluded.


Multiple-Regression  Coefficients  for  Predicting  FCS  Based  on  Cross-validation  Samples